Written
Corpus,
Language Type:
Bilingual
Languages:
Dutch English
Availability:
Freely Available
License:
Size:
26 Dialogs (Other)
Production Status:
Existing-used
Use:
Dialogue
- Paper title: Mapping the Dialog Act Annotations of the LEGO Corpus into ISO 24617-2 Communicative Functions
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eugénio Ribeiro | The DialogBank | /N |
Documentation:
None
Multimodal/Multimedia
Image Analyzer,
Language Type:
Multilingual
Languages:
Dutch English French German Modern Greek Portuguese
Availability:
Freely Available
License:
European Union Public License 1.2
Size:
200 MByte
Production Status:
Newly created-finished
Use:
Knowledge Discovery/Representation
- Paper title: Immersive Language Exploration with Object Recognition and Augmented Reality
- Paper track: Multimodality/poster presentation with demo
- Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Benny Platte | ARTranslate Open Source XCode Project | /N |
Documentation:
https://github.com/benpla/ARTranslate/blob/master/README.md
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
Adyghe Albanian Ancient Greek Arabic Armenian Asturian Basque Belarusian Bulgarian Catalan Church Slavic Classic Syriac Classical Armenian Czech Danish Dutch English Estonian Faroese Finnish Georgian German Gothic Hindi Hungarian Icelandic Ingrian Irish Kabardian Kalaallisut Kannada Kazakh Khakas Latin Latvian Lithuanian Livonian languages Low German Lower Sorbian Macedonian Maltese Middle French Middle High German Middle Low German Modern Greek Neapolitan Northern Sami Occitan Old English Old French Old Irish Old Saxon Pashto Persian Polish Portuguese Romanian Slovenian Spanish Swedish Tibetan Turkish Turkmen Ukrainian Urdu Veps Votic Welsh
Availability:
Freely Available
License:
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Size:
557.3 MByte
Production Status:
Newly created-in progress
Use:
Morphological Analysis
- Paper title: Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus
- Paper track: Multimodality/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eleni Metheniti | Wikinflection Corpus | /N |
Documentation:
https://github.com/lenakmeth/Wikinflection-Corpus/blob/master/README.md
Written
Corpus,
Language Type:
Multilingual
Languages:
Dutch English French Portuguese
Availability:
Freely Available
License:
Apache-2.0
Size:
31403 translation units (Other)
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: A Post-Editing Dataset in the Legal Domain: Do we Underestimate Neural Machine Translation Quality?
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Julia Ive | Post-Editing Dataset in the Legal Domain | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Dutch
Availability:
Freely Available
License:
GNU GPLv3
Size:
8.4 MByte
Production Status:
Newly created-finished
Use:
Named Entity Recognition
- Paper title: Creating a Dataset for Named Entity Recognition in the Archaeology Domain
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Alex Brandsen | dutch-archaeo-NER-dataset | /N |
Documentation:
https://github.com/alexbrandsen/dutch-archaeo-NER-dataset/tree/v1.0
Written
Corpus,
Language Type:
Multilingual
Languages:
Dutch English German Swedish
Availability:
Part freely available, part through search interface
License:
mixed CC and "for research purposes after registration"
Size:
None tokens
Production Status:
Newly created-in progress
Use:
historical linguistic research
- Paper title: The EDGeS Diachronic Bible Corpus
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Gerlof Bouma | EDGeS Diachronic Bible Corpus | /N |
Documentation:
https://spraakbanken.gu.se/en/projects/complex-verb-constructions
Written
Corpus,
Language Type:
Monolingual
Languages:
Bulgarian Croatian Czech Danish Dutch English Estonian Finnish French German Greek Hungarian Icelandic Irish Italian Latvian Lithuanian Maltese Polish Portuguese Romanian Slovak Slovenian Spanish Swedish
Availability:
Freely Available
License:
CC-0
Size:
341856530 sentences
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaCrawl: Web-Scale Acquisition of Parallel Corpora
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Philipp Koehn | ParaCrawl | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Dutch English French German Italian Polish Portuguese Spanish
Availability:
Freely Available
License:
CC BY 4.0
Size:
None
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
- Paper track: 8.1 Feature extraction and low-level feature model/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Laurent Besacier | Multilingual LibriSpeech (MLS) | /N |
Documentation:
https://arxiv.org/abs/2012.03411, English, public
Written
Lexicon,
Language Type:
Multilingual
Languages:
Dutch English German
Availability:
From Data Center(s)
License:
CELEX Agreement
Size:
None
Production Status:
Existing-used
Use:
Machine Learning
- Paper title: Using LSTMs to Assess the Obligatoriness of Phonological Distinctive Features for Phonotactic Learning
- Paper track: Long/Phonology, Morphology and Word Segmentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Nicole Mirea | CELEX2 | /N |
Documentation:
Yes, in English, publicly available
Written
Ontology,
Language Type:
Multilingual
Languages:
Dutch English French Italian
Availability:
Freely Available
License:
Size:
285 KByte
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Every Child Should Have Parents: A Taxonomy Refinement Algorithm Based on Hyperbolic Term Embeddings
- Paper track: Short/Textual Inference and Other Areas of Semantics
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Rami Aly | SemEval-2016 Task 13 | /N |
Documentation:
None




